Cloud-based healthcare framework for real-time anomaly detection and classification of 1-D ECG signals

Real-time data collection and pre-processing have enabled the recognition, realization, and prediction of diseases by extracting and analysing the important features of physiological data. In this research, an intelligent end-to-end system for anomaly detection and classification of raw, one-dimensional (1D) electrocardiogram (ECG) signals is given to assess cardiovascular activity automatically. The acquired raw ECG data is pre-processed carefully before storing it in the cloud, and then deeply analyzed for anomaly detection. A deep learning-based auto-encoder(AE) algorithm is applied for the anomaly detection of 1D ECG time-series signals. As a next step, the implemented system identifies it by a multi-label classification algorithm. To improve the classification accuracy and model robustness the improved feature-engineered parameters of the large and diverse datasets have been incorporated. The training has been done using the amazon web service (AWS) machine learning services and cloud-based storage for a unified solution. Multi-class classification of raw ECG signals is challenging due to a large number of possible label combinations and noise susceptibility. To overcome this problem, a performance comparison of a large set of machine algorithms in terms of classification accuracy is presented on an improved feature-engineered dataset. The proposed system reduces the raw signal size up to 95% using wavelet time scattering features to make it less compute-intensive. The results show that among several state-of-the-art techniques, the long short-term memory (LSTM) method has shown 100% classification accuracy, and an F1 score on the three-class test dataset. The ECG signal anomaly detection algorithm shows 98% accuracy using deep LSTM auto-encoders with a reconstructed error threshold of 0.02 in terms of absolute error loss. Our approach provides performance and predictive improvement with an average mean absolute error loss of 0.0072 for normal signals and 0.078 for anomalous signals.


Introduction
Physiological signals carry information on the electrical activity taking place in different parts of a human body [1,2]. Traditionally, this information is analyzed by healthcare workers to a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 evaluate the physiological condition of the patients, and the analysis helps them in decisionmaking [3]. These decisions have far-reaching consequences on diagnosis, treatment monitoring, drug efficacy tests, and quality of life. Physiological signals such as the electrocardiogram (ECG) carry information on the human heart's electrical activity, and its analysis helps in understanding the cardiovascular health of the patient [4][5][6][7]. Recently, a biometric authentication system has been developed based on a person's ECG to make it more secure [8]. The recording is performed by conventionally attaching leads to the surface of the human body over specific areas. The graph is then examined by expert healthcare professionals to identify abnormalities or anomalies. The inclusion of computer vision technology and deep learning algorithms specifically has made the process of ECG anomaly detection a matter of a few minutes. The significance of automated detection has been validated in this COVID-19 pandemic when personal visits to a hospital are riskier than the problem itself. This research is an extension of our already published work where we have implemented and analyzed an IoT-based healthcare system, and important features from the biomedical sensor data have been extracted by applying signal processing techniques for anomaly detection [9].
In recent years, the internet of things (IoT) has emerged as one of the most essential communication paradigms [10][11][12]. IoT can assist the healthcare industry due to its access to information and its ability to ensure connectivity. IoT applications in healthcare have helped people keep track of their medical requirements, such as reminding them of appointments, keeping a check on calorie count, variations in blood pressure, a check on exercises, and many more [13]. The IoT-based healthcare system field comprises sensors, microcontrollers, and other electronic devices. These devices are capable of communicating with each other. The data received from these sensors (which are mostly biomedical sensors) are stored in the cloud for analysis and monitoring performed by medical professionals or attendants. Integrating IoT structures into medical devices advances the quality and service of care for aged patients and children [14]. On-body non-invasive sensors monitor the physiological activities of the human body, and the information is stored and analyzed for the realization, detection, and potential treatment of diseases.
Over the past few years, there have been many advancements in the design of IoT-based systems, which has spurred the use and development of smart systems in various biomedicalrelated processes. Ultimately, it has supported the improvement of the healthcare system by making it intelligent and manageable. Many examples are describing the effectiveness of smart healthcare, including tracking of patients/biomedical equipment, automatic identification, correct prescription of drugs for patients, and monitoring of physiological parameters of patients in real-time. The rapid increase in cardiovascular disease (CVD) patients has led to the quality study of ECG-acquired patient data and continuous monitoring using the cuttingedge technologies [15][16][17].
In today's digital world, artificial intelligence-based techniques especially deep learning algorithms, have influenced every data-driven field and application including augmented reality, video gaming, face detection and recognition, e-health, etc. Deep learning algorithms have great potential to analyze textual data, as well as video data for gathering relevant information [18,19]. In advanced IoT systems, the challenges are continuously changing and with them, intrusion threats are also rising. Deep learning algorithms can extract the features and identify the anomalies for intrusion detection and classification. In smart home automation and security systems, deep-learning based methods can learn and detect and classify movement patterns for intruder detection. [20,21]. Recently, in mobile edge computing (MEC) systems, reduced communication latency and offloading computation can be manifested by the reinforcement learning method. In block-chain enabled MEC systems, the reinforcement learning technique is applied to create a secure resource allocation and task offloading in a multi-user system. [19,22]. In healthcare, they can assist medical staff in the perception, recognition, and prediction of diseases and would also help in decision-making. According to a survey [23], approximately 400,000 people have died in the US due to incorrect diagnoses and decisionmaking errors of the practitioners. Deep learning algorithms can reduce the diagnosis time and cost and would also help in providing accurate treatment [24,25]. Time series classification (TSC) of patients' physiological data helps to recognize and realize anomalies in physiological features by learning and identifying the temporal patterns of the acquired real-time data. TSC of ECG data utilizes the temporal order of the data for analysis, and each time the task involves human cognition [26][27][28].
Early and accurate detection of a CVD along with its precise treatment can prevent probable deaths. The conventional clinical practice is a tedious and time-consuming process since the CVD diagnosis relies on observing the morphological features of the ECG per-beat recordings, thus an automatic ECG anomaly detection method is desirable [29,30]. In this research, we have presented an intelligent, automatic 1D ECG anomaly detection and classification method based on AI-based algorithms. The system performs better by improving the featureengineered parameters of large and diverse datasets. Healthcare data is often in huge volume and to manage long-term batch processing of stored data, an AWS cloud-based architecture is proposed to improve the patient's data availability, easy data access, and model robustness. Multiclass classification is a challenge due to the contribution of a large and diverse dataset, a wavelet time scattering technique is implemented to decrease the data dimension and make the classification process time-efficient.
This research is an extension of our already published work where we have implemented and analyzed an IoT-based healthcare system, and real-time features from the biomedical sensor data have been extracted by applying signal processing techniques for anomaly detection [9]. The key contributions of this research work include the following: 1. Design and implementation of an end-to-end AWS cloud-based health monitoring framework. The data from on-body sensors have been acquired and stored for real-time monitoring using big data cloud computing techniques. This paper presents an end-to-end AI-edge platform to find the optimized algorithm by comparative investigation of automatic ECG signal classifications.
2. Improved feature set has been developed by combining the carefully crafting hand-engineered features with automatically detected features through AI-based algorithms.
3. The implemented architecture detects anomalies automatically with the help of unsupervised deep LSTM auto-encoders (AEs). LSTM-AE minimizes the loss ratio to decrease the reconstruction error during the training stages on the encoder/decoder. 4. Wavelet time scattering feature extraction is implemented to reduce the signal size to increase the real-time processing speed.
In the remaining part of this article, we present the related study in section II, design, and implementation in section III. This section also includes signal processing of acquired data and time series classification and based on the analysis of signal data, the classification results are discussed in section IV. We then conclude the article in section V.

Related work
Temporal data have varying abstraction and patterns, and learning the important features for the classification of time series data is a problem that has been addressed by researchers in the past few years. Time series classification groups the temporal data based on extracted features into classes for identification and recognition. In this research, a comparative analysis of more than twenty machine learning is presented. A cloud-based healthcare framework is implemented with improved feature extraction, and extensive machine/deep learning algorithms are applied to learn and train the extracted data for automatic time series classification. Preprocessing and feature extraction through applying signal processing techniques would make the feature extraction process much more reliable and refined. The researchers have invested a good amount of time in analysing signal processing techniques for the construction of structural health monitoring (SHM) for damage detection [31]. A simple Butterworth filter was used to denoise the signal cross-correlation to detect the extent of the damage.
Detection and classification of physiological signals such as ECG for heartbeat diagnosis that has utilized discrete wavelet transform (DWT) and support vector machine (SVM) have resulted in 98.8% classification accuracy [32]. The limitation of such work is usually the vector size. The DWT technique used in this work, lacks the shift-invariance property which affects the performance and accuracy of classification. Other aspects, including the security of industrial IoT-based healthcare, have been explored with the inclusion of encrypted techniques such as watermarking for the prevention of theft. This work doesn't highlight the efficient data management technique. A large amount of data collected by body sensors in IoT-based healthcare should be managed properly. For this reason, big data analytic techniques have been introduced in healthcare organizations [33][34][35][36].
Deep learning algorithms have made their mark in analysing large and varied datasets compared to other classification methods. This is because deep learning algorithms can avail all the available input during the development process [1]. Biomedical signals including electroencephalogram (EEG), ECG, electromyograph (EMG), etc. have been analyzed for this purpose. In another study [14,37], a healthcare system for critical patients and this system was to be operated on by medical attendants/nurses anywhere. The system has a smartphone application developed to operate with biomedical sensors for acquiring the required data and dedicated health servers. However, a major limitation is the case of GPS/network unavailability and the system's response time.
The process of classification of a time series involves labeling or assigning a class to the data/time series. As the accessibility of temporal data has increased [38], a large number of algorithms have been proposed to address the classification problem. Generally, TSC can be categorized into feature-based, model-based, and distance-based methods [26,[39][40][41].
Over the past few years, with advancements in deep neural networks, deep learning algorithms have been introduced to solve time series classification problems with algorithms such as recurrent neural networks (RNNs). The convolutional neural networks (CNNs) have been shown to provide state-of-the-art results on challenging activity recognition tasks with feature engineering and classification tasks in a single algorithm [26,42,43]. In deep learning algorithms, the auto-encoder (AE) is an unsupervised way of learning the features of the training data [44]. It reconstructs the input data as an output. Unsupervised AE can learn the important input features by itself by utilizing nonlinear mapping without even labeling the data and then reconstructing the input data. A convolutional auto-encoder is a modified type of AE where a 1D-CNN is used to build a layer that is integrated to extract the important features from raw input data while a 2D-CNN layer is added for image processing [45]. The CNNs have the ability to use the ECG-acquired data as a set of segments to apply 1D CNN, as well as convert it into the image and apply 2D-CNN. To overcome the temporal dependencies, two or more neural networks can be integrated. In [46], CNNs have been used to learn the spatial properties and RNNs have been applied to temporal features. In another research [47], intervals of an ECG signal have been acquired from a 2D image. CNN's have been implanted for feature extraction while the ECG signal has been decomposed into its components by applying continuous wavelet transform (CWT) for arrhythmia classification. Arrhythmia is the detection of an irregular heartbeat. However, applying CWT for arrhythmia classification can be computationally expensive. In a similar research [48], R-peak has been detected from ECG signal for arrhythmia detection. The classification has been done using CNN and random forest techniques. In [49], the classification accuracy of automatic arrhythmia detection has been improved by combining the temporal (FS1) and frequency domain (FS2) features. The classification has been done using DTCWT and a random forest classifier. The training process in this work depends on the labeling of the ECG data. A hybrid deep CNN algorithm can detect and classify different heartbeats through real-time ECG data for the detection of any abnormality in heartbeats. [50] In an entropy-based ECG classification, [51], An ECG dataset has been trained in a three neural network-based architecture. First CNN-based, second SincNet based, and third the entropy-induced CNN-based architecture for better computational efficiency. In [52], a solution is given for improving classification accuracy in the case of limited training data available. They have utilized Stockwell transform (ST) technique along with a two-dimensional residual network (2D-ResNet).
The above literature gives deep insights into the contribution of AI technologies in the present healthcare system. However, for long-term batch processing of large volumes of available healthcare data, an end-to-end system is required. The system should be able to store raw data in the big-data cloud, pre-process it for feature extraction and anomaly detection, and an automatic ECG signal classification on the detection of an anomaly, all in a single go. The available literature lacks a multi-function, end-to-end, cloud-based system to deal with continuous temporal changes in a patient's ECG signal. Moreover, most of the previous research work relies on massive labeled ECG data for training. One of the advantages of this research appears as the multiple steps data processing has been replaced with raw, 1D time series ECG signals to assess cardiovascular activity efficiently. For the time series raw ECG signals, LSTM-based deep learning methods show better accuracy [53], therefore, for automatic anomaly detection of 1D raw ECG time series signals, LSTM-AE is implemented in this research to achieve high accuracy and robustness. For multiclass ECG classification, a large and diverse dataset affects the classification accuracy thus an effective signal reduction technique is required to make the classification process time efficient. Another advantage of this work is that a large volume of healthcare data has been stored in the cloud for long-term batch processing and research analysis. This makes the system more accessible to healthcare experts and researchers.

Methods and materials
This work aims to design and implement a cloud-based AI-enabled healthcare system with the ability to combine and enable different, yet complementary technologies. The end-to-end improved automatic ECG classification system in Fig 1, would collect, store, and analyze the raw 1D ECG signals/data about the physiological parameters of a patient and deliver them to a signal processing unit. The ECG data is analyzed for the detection of anomalies, the data is pre-processed for hand-engineered features to make a feature set. Next, the wavelet transform is applied to ECG data to reduce its size to make the system computationally efficient, and then classification takes place in the final step.
Then, finding anomalies in the patient data and sending alerts to a medical professional, helping patients take an active role in managing their health in an end-to-end automatic system.
This system comprises raw ECG data, which is often in huge volume, and to manage longterm batch processing of stored data, an AWS [54] cloud-based architecture is proposed. Fig 2 shows the design of the cloud-based healthcare system and signals analysis in the AWS cloud using different data ingestion and analysis tools. Anomaly detection of acquired ECG signals is achieved by a two-step process: 1. Real-time ECG signal anomaly detection. 2. Batch processing for the classification of ECG signals.
In [4] the ECG data is retrieved and processed in LabVIEW for anomaly detection in physiological signals including an ECG for analyzing the heart's health. The physiological signals used in this research were collected from Physionet [55], the data scarcity has been avoided by merging three different datasets and extracting 162 ECG recordings for three different classes [56-58]. To make it free from baseline wanderings and other motion artifacts, wavelet transform was used. The wavelet transform has its significance due to its better localization properties in both the time and frequency domains [59]. The proposed architecture in the system has used streaming analytics tools of AWS cloud to detect the real-time anomaly existing in the ECG signals. Kinesis data streams [60] are used to extract the data from the ECG logs and feed them to the kinesis data analytics for anomaly detection.
AWS Lambda function is used to detect anomalies in the ECG signal and trigger the Amazon Simple Notification Service (Amazon SNS). Amazon SNS is a fully managed messaging service and communication for application-to-person (A2P) and application-to-application (A2A). In the designed healthcare framework application A2P service is utilized to inform the patient and the doctor-in-the-loop in the case of any anomaly found in the ECG signal. A trigger is also generated in the batch processing layer to further classify the ECG signals to predict the relevant signal class and displayed it to the doctor. This helps in the decision-making process.
For the classification layer, AWS Kinesis Data Firehose is used to feed the logs to the spark clusters for the preprocessing and feature extraction. After extraction of relevant features, these data logs are saved in an amazon simple storage service (Amazon S3) bucket. Amazon SageMaker is deployed on these featured logs to classify the ECG signal classes. Amazon Sage-Maker is a fully managed machine learning service that helps to easily build and deploy machine learning models in a production-ready hosted environment.

Data pre-processing
The acquired data are filtered in the first step of the analysis, which includes detrending and denoising of the acquired signals to accurately compare the minimum threshold values already computed into the designed framework.

Data pipeline
The processed ECG signals are fed into amazon's Kinesis data streams for creating a real-time pipeline for data analytics. Amazon Kinesis is self-managed, auto-scale Amazon Kinesis data streams have different components like kinesis producers and kinesis consumer's library which produce and consume data logs. Producers send the data in data streams which are divided into the data at the rate of 1Mb/sec or 1000 messages at write per shard and consumers can consume the data at 2Mb/s across all the users. These features enhance the capability of the healthcare framework by incorporating a central repository to access the data across different locations by different consumers. One important feature of the proposed design is that the  data retention period is 24 hours and can be further extended to 7 days if the data is not ingested by the consumers.

Data reduction
Carefully extracting the important features for training the ECG data is a very significant step in the classification process. Methods such as autoregressive model (AR) [61] and Shannon entropy (SE) [62,63] have been used for ECG classification in literature. In this research in order to include the temporal response of the signal wavelet scattering transform has been implemented for careful ECG feature extraction. The wavelet transform computes the inner products of a time series signal [64]. Transient ECG signals are present due to the presence of heartbeats. These transients are generally not smooth and are of short duration. Transients are precisely captured by wavelets due to their flexibility in shape and short duration.
For the analysis of the signal f(t), the frequencies have been covered by wavelet function ψ and low-pass filter ϕ, a group of wavelet indices A k having octave frequencies. The wavelet transform is for generating the local invariant feature of f is given as: A wavelet transform for the lost higher frequencies in this process is given as: By taking the average of wavelet modulus coefficients, the first-order scattering coefficients can be computed as: Complimentary high-frequency components can be computed as: The second-order coefficients can be computed as: The final scattering matrix for mth-order scattering coefficients is given as: The repeated application of modulus of the wavelet transform moves the energy contained in the high frequencies of the initial signal towards the low frequencies. At each step of the scattering transform, the energy in the lowest frequency band is output by convolution with ϕ j . The coefficients can be referred to as scattering features and they are down-sampled to reduce the computational complexity of the network.

Anomaly detection
An autoencoder (AE) is a type of neural network that comes under the category of an unsupervised learning technique, it is capable of learning the best encoding-decoding scheme from the given data. The encoder present in the structure compresses the input data into the latent space while the decoder decompresses the encoded representation into the output layer. The weights in the neural network are adjusted according to the propagated difference error created when the encoded-decoded output is compared with the input data. The setup for a generative model for a latent variable Z is given as Z ! f ϕ ! X, where f ϕ is a neural network with parameters ϕ and X be a parametric random variable whose distribution depends on f ϕ (z). This model is fit to a dataset x 1 , . . ., x n and finds the maximum likelihood estimate of the data.
LðyÞ ¼ log Prðx 1 ; :::; x n ; yÞ ð7Þ LðyÞ ¼ LðyÞ Where i is all the elements in the vector. An AE is simply a feed-forward deep-neural network where instead of predicting an output as the class label, we are reconstructing the input signal x. So, to train an auto-encoder we need to define a loss function that would compare the input signal x and reconstructed signalx. The loss function for real-valued is given as:

ECG classification
This research work presents an extensive comparative analysis of artificial intelligence (AI) based algorithms including both machine learning and deep learning algorithms, to find the best possible option for the ECG classification model into three classes: arrhythmia (ARR), congestive heart failure (CHF), and normal sinus rhythm (NSR). The signal length of each of these samples is 65,536 samples long, and we have 162 records of ECG data. A typical, easy way to train machine learning models is to directly feed our raw ECG signals in a machine learning classification algorithm, which would give us outputs such as the ARR and CHF. However, unfortunately, this approach does not work, probably because the signals are lengthy and the features of the signals change very rapidly with time, which the machine learning model is unable to interpret. We have added the step of feature engineering to extract the features from the ECG signals and use those features to train our machine learning algorithms, which helps build this model with higher accuracy. The feature engineering step has numerous advantages and numerous aspects because it reduces the dimensions of the input data. Then, feature selection is also a part of feature engineering, which allows us to select the features that are more prominent or have a greater effect on training the machine learning models, and we can eliminate the other features, which are probably adding unwanted signals to the input data set. K-Nearest neighbor (KNN) method. The k-nearest neighbor (KNN) classification method is a machine learning algorithm that works without a training stage step. The classification tasks are performed by calculating the distance between the test sample and all training samples as a first step, to obtain its nearest neighbors.
This method includes two parts: a) determining k-close neighbors, b) determining class type using these close neighbors. Consider the training data space D described as D = X 1 , X 2 , X 3 , . . .. . . X n , Which includes n samples, and each sample X i is defined as: Considering three different classes in the training data, to determine the particular classx in the given data, we need to know its distance from each element in the data present in space D. One of the most common methods for calculating the distance betweenx and x i is the Euclidean distance criteria. Euclidean distance is defined as the given equation.
d ¼ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffiffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi Support vector machine (SVM) method. One of the important machine learning algorithms for the classification of pattern recognition is the support vector machine (SVM) method. An SVM is a supervised machine-learning algorithm that can be employed for both classification and regression purposes. The SVM algorithm has been used widely while dealing with real-time data problems. Support vector machines are binary classifiers. However, in real cases, data is to be classified into more than 2 classes. This is done by using multi-class SVM. This can be done by using the one-against-one (OAO) approach. According to that approach, for a 3-class classifier as required in our case, (c(c − 1))/2 number of SVMs are trained. In our case of multi-class SVM, c is 3.
RNN-LSTM method. The long short-term memory (LSTM) network is a type of recurrent neural network (RNN). Instead of mapping inputs to outputs alone, RNN contains cycles of mapping function for the inputs over the previous time steps to an output. Training of RNN has been a challenge to the researchers due to either vanishing gradients or exploding gradients [65]. In vanishing gradients, weight updating resulted in smaller weights and have no further impacts. Similarly, exploding gradients have weights updating resulting in overflow. The architecture of the LSTM memory cell is shown in Fig 5 LSTM-RNN architecture models a nonlinear dynamic system by mapping the input sequence to the output sequence. The input vector at sampling interval t is concatenated with the LSTM cell hidden state h(t − 1), state of the previous sample. This input is passed through the tanh activation function and at the second step combined input is passed via the input gate which is a layer of sigmoid function which controls the input flow by making input gate weights close to 0 or 1. In the next step, data flows in the cell through the internal state's t0 and forget gate which also helps to drop unwanted input. Input data is added to the one-step lagged s(t − 1) recursively.
Finally, the output state is reversed by the tanh function which is controlled by the output gate. In the LSTM memory cell, the input, output and forget gates allow the LSTM to forget or write new information to the memory cell. Mathematical equations of the memory cell are given below: Where o, i, f, c are the output, input, forget gates, and memory cell of the network. The architecture of a simple LSTM network for classification problems starts with a sequence input layer followed by an LSTM layer.
Kinesis data analytics. It is used for real-time data analytics and data streaming. In the proposed framework, it is used for anomaly detection in ECG signals. Once the anomaly is detected, the corresponding ECG signals of the 10 secs window are saved/streamed into the kinesis data firehose for further classification of ECG signals. These logs are saved in the S3 bucket for future reference and data repository.

Input data
The physiological signals used in this research were collected from Physionet [55]. Three datasets are used in this research: the MIT-BIH arrhythmia database [56], MIT-BIH normal sinus rhythm database [57], and BIDMC congestive heart failure database [58]. Data have three classes: ARR, CHF, and NSR, with a total of 162 records. Training data are structured into an array of two fields with data and labels. The data field is a 162x65536 matrix, where each row is a training recording sampled at 128 Hz. The sample Plot of each class is shown in Fig 6.

Improved feature engineering
In this research, carefully crafted hand-engineered features have been combined with automatically detected features from machine learning (ML) and deep learning (DL) algorithms to form a high-dimensional improved feature set. Hand-crafted features have been selected on the basis of standard clinical mean and median values of ARR, NSR, and CHF. This feature set is also helpful in generating early signs of an anomaly to the doctor-in-loop for timely treatment. The decomposition of a normal ECG signal waveform comprises a P wave, QRS complex, and T wave. These are the important parameters we located during the analysis of the detected ECG signals. To understand the cycle for better analysis, the P wave relates to the heart's atrial depolarization, and the T wave corresponds to the heart's ventricular repolarization. The interval between two consecutive R waves is known as the R-R interval, the presence of an arrhythmic heartbeat would impact the RR interval length. The QRS wave is the combination of three deflections and it represents depolarization and contraction of ventricles. The R-R interval and QRS interval as shown in Figs 7 and 8, serve as important indicators of various heart diseases, such as arrhythmia.
In Table 1 the extracted intervals and peak values have been shown, which are compared to threshold normal values (stored manually) for generating medical emergency alerts. The PR, QT, and ST intervals discussed in Table 1 indicate the depolarization of ventricles in segments.
The ECG data are received via a biosignal acquisition device and stored in the cloud. If the variation in LabVIEW-detected particular intervals is large, irregular ECG is detected.
The next variable that is part of our analysis is heart rate variability (HRV). This is a variation in time intervals between the consecutive heartbeats. HRV is regulated and controlled by the basic part of our nervous system called the autonomic nervous system (ANS). An abnormal HRV pattern leads to life-threatening cardiac diseases such as arrhythmia. HRV can be affected by other factors, including aging and gender. Fig 9 reflects the heart rate number under normal conditions.

Wavelet scattering transform
ECG time-series signals usually require higher frequency resolution than time resolution because they have a low-frequency component. However, a high time resolution is required

PLOS ONE
Cloud-based healthcare framework for ECG signals

PLOS ONE
for high-frequency components in ECG because they vary quickly with time. Therefore, a multi-resolution analysis method would analyze an ECG signal precisely if the ECG signal comprises both high-and low-frequency components. In wavelet scattering, the propagation of data is done by a series of wavelet transforms as shown in Fig 10, nonlinearities, and averaging. Therefore, it produces low-variance representations of time series.
Wavelet time scattering yields signal representations insensitive to shifts in the input signal without sacrificing class discriminability. The selection of wavelet filter type has a significant effect on the overall de-noising performance of the proposed method. The wavelet transform has made the scale discrete by using the Daubechies-4 (db4) wavelets of wavelet filters since the data is identical with an in-variance scale 15. The selection of db4 Wavelet as a measuring parameter is because of the resemblance between its scaling function and the shape of the actual ECG signal.

Anomaly detection through LSTM-AE
The extended part of this research work contributes to the system by making the design intelligent through end-to-end anomaly detection and classification of the ECG signals. An LSTM-AE-based deep neural network is implemented to detect anomalies in acquired ECG signals. To augment the training dataset, a 1 x 65536 samples long signal is divided into 200 sample chunks and fed to AE for achieving bettering accuracy. The model is implemented in Keras, and the model summary is shown in Fig 11. The model used in the classifier has a simple structure of one LSTM layer with 256 followed by a dropout layer with a rate of 0.2. Then, a repeat vector followed by the decoder level of the LSTM layer with 256 and a time-distributed layer is applied. Adam optimizer is used with mean absolute error (MAE) loss. An average MAE loss of 0.0072 for normal signals is achieved, and 0.078 is achieved for anomalous signals. Fig 12 shows the distribution of normal signal Mae loss, while Fig 13 shows the distribution of anomaly signal loss. It is shown that 98% of the normal signal loss distribution is under 0.05, while the distribution of anomaly signal Mae loss is after 0.07. Therefore, anomalies are accurately detected by setting the maximum loss threshold of 0.068. The model's ability to reconstruct normal signals and anomaly signals is shown in Fig 14. The normal signals model has much less reconstruction error loss than anomaly signals by learning/mapping the dependencies present in the input signal. The first row of signal plots in Fig 14 is for the normal signal compared with the reconstructed signal (red), and the second row of signal plots shows the anomaly signal compared with the reconstructed signal (red). It is clear from the graphs that the anomaly reconstruction error is almost 10 times greater than the normal signal reconstruction error loss. By setting reconstructed error thresholds of 0.02 in terms of MAE loss, 98% accuracy is achieved for anomaly detection. We have implemented an automatic detection approach to detect anomalies using an LSTM-AE. This deep learning-based approach enables the discovery of constraints involving long-term nonlinear associations among multivariate time-series data records and attributes.

ECG classification results
A wavelet scattering filter bank is constructed giving the length of the signal, sampling frequency, quality factor of [8,1], and default invariance scale. We extracted the features from Fig  15, using the function feature matrix, reducing down to a feature set of 499 by 8, which is a 95% reduction in the size of features. We separated 113 samples for training, and 49 samples were excluded from ECG samples for testing and randomly selected and computed the feature matrix of all of these data for both the test and training samples. The output of the training data feature set is 499x8x113. The wavelet scattering transform is critically down-sampled in time based on the bandwidth of the scaling function. In this case, this results in 8-time windows for each of the 499 scattering paths. The data were then trained by implementing 25 machine learning algorithms, including KNN, SVM, ensemble subspace discriminant, and subspace KNN, which had higher accuracy, as shown in Table 2. While evaluating this trained model on the test data, the accuracy of the predicted class will be compared with the true class accuracy. We obtained the classification results based on different numbers of feature sets.
We trained the raw signal by inputting all 499 features, and we observed SVM fine, cubic, medium Gaussian, and ensemble subspace discriminant with prominent testing accuracy.  For improved feature selection, we used the "fscmrmr" function [67], which is an automatic feature selection algorithm that ranks features for classification using the minimum redundancy maximum relevance algorithm. It calculates the scores for all the training data sets and evaluates them on what features have a greater impact and which features have less impact.
If we plot the first 60 features as shown in Fig 14, feature number one has a very high score, then the second-highest score is feature number 141. Third highest, 395, 73, and so on. Therefore, for the next training, we used the top 20 feature sets, not using the entire 499 feature set. Similarly, when there are only 20 features to be used for training, there are fewer feature sets, so the training would be very quick. We reduced the feature set to 60 samples to include only the most important features, and we observed improvements in the accuracy of most SVM-, KNN-and ensemble-based ML algorithms. By reducing the feature set up to the 20 most important samples, we were able to obtain testing accuracy up to 98%.
The three classes that we need to build a model to classify the signals are ARR, CHF, and NSR. In each class, we have approximately 96 signals for the first class and 30 signals for the second and third classes, which is the NSR, with 36 signals. Each signal is approximately 65,536 samples long. Now, the goal of this research is to find an optimal algorithm that can classify ECG signals into these three distinct categories most accurately. Therefore, applying ML techniques can reduce the dimensions of the signal automatically and extract the features while losing information about the signal itself.
The main idea is to reduce the signal dimensions while preserving the information. Therefore, DL algorithms have been implemented so that the signal does not lose any information during training. The sample signal we used earlier had 65,536 To predict class labels, the network ends with a fully connected layer, a softmax layer, and a classification output layer. Through the dropout function, some hidden outputs do not influence the training of the model. However, during testing the network dropout function will be turned off and all the hidden neurons output will affect the testing procedure. In this paper one dropout layer is used during training of the LSTM network model. We can directly feed in the signal to the DL-based LSTM network, and under a few situations, instead of feeding the signals directly to LSTM networks, we can perform feature extraction before feeding the signals into the LSTM network, if required. We can train our LSTM networks on those features to build our model in situations where feeding raw data directly into LSTMs does not work, if we have fewer data to begin with, which is typically the case for many AI-based problems, or in situations where data augmentation can be very challenging.
The samples, and the features that were extracted automatically had 499 x 8 features. Therefore, there are approximately 4,000 features in the data set. This shows a 95% reduction in the size of the features compared to the original signal. We have taken all the features, which is 499 x 8, by taking that as the whole matrix and then trained the LSTM network. We built a small LSTM network using Matlab 2020's deep network designer app., Table 3 shows the detail of the network with the total number of hidden units, number of layers, and other details. We trained the network by hyper-parameters tuning using Matlab 2020 Experiment manager. The experiment manager trained the network defined by the setup. By default, one trial is run at a time, and combinations are made in each trial. The parameters such as the activation functions are selected by default where the state activation function is "tanh" and the gate activation function is "sigmoid". The parameters for training the network were selected after finding the optimized parameters. Table 4 shows the values and evaluation of parameters such as initial learning rate, mini-batch size, optimizer, maximum epoch size, and "last" as the output mode after hyper-parameter optimization, giving 100% classification accuracy. The input side is 499 because they are the number of rows for every signal.
In Table 4 all the combinations of hyper-parameters are given. Among them, we have selected the parameters having the best validation/testing accuracy. The selected parameters are adam as an optimizer, 150 epochs, and 1000 mini-batch size gives 100% validation/testing accuracy. Different values for mini-batch size were tested ranging from 800-1500 and it showed no major difference in performance with a minor difference in training time. Therefore, we used the mini-batch size value as 1000.
The LSTM network is trained quickly, approx. in 45 seconds. To evaluate the LSTM model, we extracted all the features from the test signals. This model yields 100% accuracy, as shown in Fig 16. Thus, LSTM has proven to be the optimal technique without the limitation of losing information and being accessible to noise due to reduction in the feature set, being a quick solution for raw 1D ECG data classification. In order to validate our algorithm, the accuracy and loss curves have been plotted in Fig 17. This figure shows the accuracy and loss curves of training and validation data. A maximum accuracy of 100% has been obtained on the 50th iteration. Similarly, the loss curve shows a continuous decay after the 40th iteration.
We have also plotted the receiver operating characteristic (ROC) curves for evaluating the performance of our model with other ML models. The models in Table 2   For the final evaluation of our implemented model, we have plotted the precision-recall curve (PRC). The PRC has emerged as a good indicator of a classifier's performance. A PRC of LSTM is shown in Fig 21 which shows a high area under the curve for both recall and precision values. A high precision is associated with a low false positive rate, similarly, a high recall is linked to a low false negative rate.
Out of all evaluated classifiers in Table 2, the classifiers having better testing accuracy have been evaluated using other metrics including precision, recall, and F1 score. On the basis of all these parameters, Table 5 incorporates the comparison. Table 6 summarises the testing evaluation scores of classifiers with respect to each class individually. According to that our method outclasses other methods with 100% precision, recall, and F1 score for each class.
In Table 7, our method has been compared with existing literature and the results show that our approach outclasses others in accuracy, precision, recall, and F1 score.

Conclusion
In this work, an AWS cloud-based framework has been designed and implemented for automatic anomaly detection and classification of 1D ECG signals. The anomaly detection of raw   ECG signals has been implemented using an LSTM auto-encoder. The results show a reconstructed error threshold of 0.02 in terms of MAE loss with 98% accuracy for anomaly detection. The proposed system is effective and time-efficient for large datasets. The wavelet time scattering feature extraction method has achieved 95% signal reduction to increase the realtime processing speed. In addition to anomaly detection, this work provides an extensive comparative study of state-of-the-art techniques to identify the optimum solution for the classification of ECG signals.
An improved feature set has been developed in this study by combining the carefully crafting hand-engineered features with automatically detected features through AI-based algorithms to make the system more effective.
The comparative evaluation has shown that our approach, the deep-learning-based algorithm LSTM has shown 100% testing accuracy, F1 score, Precision and Recall for multi-class ECG signal giving in the raw 1D ECG data with 95% reduction in signal length to make the proposed system efficient.
In the future, we would work on improving the automatic ECG anomaly detection and classification system, by making it patient-specific to challenge the classifier for temporal and morphological variations in different patients.
The framework can be extended to real-time analysis platforms such as spark streaming or Apache Kafka open-source solutions.
The limitation of this study is that the proposed system was not validated against the hospital patient data. We have only used the publicly available dataset.